Machine Translation As A Testbed For Multilingual Analysis
Abstract
We propose that machine translation (MT) is a useful application for evaluating and driving the development of NL components, especially in a wide-coverage analysis system. Our MT system is a transfer system built on linguistic modules, so correct analysis is expected to be a prerequisite for correct translation; given relatively mature transfer and generation components, this suggests a correlation between the two. We show through error analysis that there is indeed a strong correlation between the quality of the translated output and the subjectively determined goodness of the analysis. We use this correlation to guide the development of a coordinated parallel analysis effort in seven languages.
Similar resources
Transculturation and Multilingual Lives: Writing between Languages and Cultures
This paper looks at the issues of transculturation as explored in auto- and semi-autobiographical accounts of linguistic and cultural transitions. The paper also addresses a number of questions about the structure of these texts, the authors’ linguistic competences, as well as questions about the theoretical and conceptual tools which may help us to discuss the issues the writers are reflecting o...
Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts
This paper presents an evaluation of the use of machine translation to obtain and employ data for training multilingual sentiment classifiers. We show that using machine-translated data obtains results similar to using native-speaker translations of the same data. Additionally, our evaluations point to the fact that the use of multilingual data, including that obtained through mac...
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space character leads to some s...
A testbed for developing multilingual phonotactic descriptions
This paper presents a testbed for developing multilingual phonotactic descriptions that employs finite state methods to represent the phonotactics of one or more languages. The motivation for this work is to make an extensive range of phonotactic descriptions of varying granularity available for speech technology applications. We discuss the design of the phonotactic testbed and how various mod...
Grammar Sharing Techniques for Rule-based Multilingual NLP Systems
Rule-based multilingual natural language processing (NLP) applications such as machine translation systems require the development of grammars for multiple languages. Grammar writing, however, is often a slow and laborious process. In this paper we describe a methodology for multilingual and multipurpose grammar development based on grammar sharing. This paper presents the first step towards a ...
Publication date: 2002